Behaviour Design in Microrobots: Hierarchical Reinforcement Learning under Resource Constraints
Abstract
To verify models of the collective behavior of animals, robots can be programmed to implement a model and interact with real animals in a mixed society. This thesis describes the design of the behavioral hierarchy of a miniature robot that is able to interact with cockroaches and participate in their collective decision making. The robots are controlled by a hierarchical behavior-based controller in which more complex behaviors are built by combining simpler behaviors through fusion and arbitration mechanisms. Experiments in the mixed society confirm the similarity between the collective patterns of the mixed society and those of the real society. Moreover, the robots are able to induce new collective patterns by modulating some behavioral parameters. The difficulty of extracting the behavioral hierarchy manually, and the inability to revise it, lead us to machine learning techniques for devising the composition hierarchy and the coordination automatically. We derive a Compact Q-Learning method for micro-robots with processing and memory constraints and use it to learn behavior coordination; behavior composition is still done manually. However, the curse of dimensionality makes flat learning techniques of this kind unsuitable. Even though optimizing them can temporarily speed up learning and widen their range of applications, their scalability to real-world applications remains in question. In the next steps, we apply hierarchical learning techniques to automate both behavior coordination and composition. In some situations, many features of the state space may be irrelevant to what the robot is currently learning. Abstracting these features and discovering the hierarchy among them can help the robot learn the behavioral hierarchy faster.
We formalize the automatic state abstraction problem with different heuristics and derive three new splitting criteria that adapt decision tree learning techniques to state abstraction. The performance claims are supported by strong evidence from simulation results in deterministic and non-deterministic environments. The simulations show encouraging improvements in the required number of learning trials, the robot's performance, the size of the learned abstraction trees, and the computation time of the algorithms. On the other hand, learning in a group provides free sources of knowledge that, if communicated, can broaden the scale of learning both temporally and spatially. We present two approaches to combining the output or the structure of abstraction trees. The trees may be stored in different RL robots in a multi-robot system, or may be learned by the same robot using different methods. Simulation results in a non-deterministic football
Similar resources
Hierarchical Reinforcement Learning for Spoken Dialogue Systems
This thesis focuses on the problem of scalable optimization of dialogue behaviour in speech-based conversational systems using reinforcement learning. Most previous investigations in dialogue strategy learning have proposed flat reinforcement learning methods, which are more suitable for small-scale spoken dialogue systems. This research formulates the problem in terms of Semi-Markov Decision P...
Hierarchical Functional Concepts for Knowledge Transfer among Reinforcement Learning Agents
This article introduces the notions of functional space and concept as a way of knowledge representation and abstraction for Reinforcement Learning agents. These definitions are used as a tool for knowledge transfer among agents. The agents are assumed to be heterogeneous; they have different state spaces but share the same dynamics, reward and action space. In other words, the agents are assumed t...
Evaluation of a hierarchical reinforcement learning spoken dialogue system
We describe an evaluation of spoken dialogue strategies designed using hierarchical reinforcement learning agents. The dialogue strategies were learnt in a simulated environment and tested in a laboratory setting with 32 users. These dialogues were used to evaluate three types of machine dialogue behaviour: hand-coded, fully-learnt and semi-learnt. These experiments also served to evaluate the ...
TeXDYNA: Hierarchical Reinforcement Learning in Factored MDPs
Reinforcement learning is one of the main adaptive mechanisms that is both well documented in animal behaviour and giving rise to computational studies in animats and robots. In this paper, we present TeXDYNA, an algorithm designed to solve large reinforcement learning problems with unknown structure by integrating hierarchical abstraction techniques of Hierarchical Reinforcement Learning and f...
Using Abstract Models of Behaviours to Automatically Generate Reinforcement Learning Hierarchies
In this paper we present a hybrid system combining techniques from symbolic planning and reinforcement learning. Planning is used to automatically construct task hierarchies for hierarchical reinforcement learning based on abstract models of the behaviours’ purpose, and to perform intelligent termination improvement when an executing behaviour is no longer appropriate. Reinforcement learning is...